Abstract

Estimation of statistical moments from simulation, i.e., the mean and standard deviation of
an output, may involve large uncertainty caused by the variability in the input random
variables. Allocating resources to obtain more experimental data can reduce the
variance of the output moments (mean and standard deviation). The proposed methodology
used an optimization method to determine the optimal number of additional
experiments required to minimize the variance of the output moments given a constraint. A
method to generate the output moments based on the moments of the input variables was
implemented. The method used the multivariate t-distribution and the Wishart distribution
to generate realizations of the population mean and population covariance of the input
variables, respectively. This method was able to handle both independent and correlated
variables. A fretting fatigue problem was explored to minimize the variance of the
cycles-to-failure mean and standard deviation. The optimal number of additional experiments
required for each random variable depended on the number of initial data points, the
influence of the variable in the output function, the cost of each additional experiment, and
the variance of the sample mean.


Nomenclature

$A_i$ = Constants in the output function for variable $X_i$
$b$ = Funds available for additional experiments
$C_{G_i}$ = Cost of each additional experiment for group $G_i$
$C_{X_i}$ = Cost of each additional experiment for variable $X_i$
$D_{G_i}$ = Number of additional experiments for group $G_i$
$D_{X_i}$ = Number of additional experiments for variable $X_i$
$E_{G_i}$ = Number of initial data points for group $G_i$
$E_{X_i}$ = Number of initial data points for variable $X_i$
$G_i$ = Group of random variables $X_i$
$G_{D_i}$ = Values of additional experiments for group $G_i$
$G_{E_i}$ = Values of initial data points for group $G_i$
$gbestx$ = Best position encountered by any particle in PSO (Global best)
$k$ = Optimization iteration number
MCS = Monte Carlo sampling
$N_f$ = Cycles-to-failure
$n_{X_i}$ = Number of total data points for variable $X_i$
$N_p$ = Multivariate normal distribution
$p$ = Number of variables in the output function
$pbestx_i$ = Best position found by the $i$th particle in all previous iterations in PSO (Personal best)
PDF = Probability density function
PSO = Particle swarm optimization
$q_1$ = Individual weight in PSO
$q_2$ = Social weight in PSO
$r_i$ = Number from uniform distribution between 0 and 1 for the $i$th PSO particle
$S_{X_i}$ = Sample standard deviation for variable $X_i$
$S^2_{X_i}$ = Sample variance for variable $X_i$
$t_{n-1}$ = Univariate Student's t-distribution with $n-1$ degrees of freedom
$t_{p,n-1}$ = Multivariate t-distribution of $p$ variables with $n-1$ degrees of freedom
$v_i^k$ = Velocity of $i$th PSO particle at iteration $k$
$v_i^{k+1}$ = Velocity of $i$th PSO particle at iteration $k+1$
$w$ = Inertia weight in PSO
$W$ = Wishart distribution
$X_{D_i}$ = Values of $X_i$ for additional experiments $D_{X_i}$
$X_{E_i}$ = Values of $X_i$ for initial data points $E_{X_i}$
$X_i$ = $i$th input variable in the output function
$X_i^{(j)}$ = $j$th observation of the $X_i$ input variable
$\bar{X}_{X_i}$ = Sample mean of variable $X_i$
$x_i^k$ = Position of $i$th PSO particle at iteration $k$
$Z$ = Output function (or response function)
$\chi^2$ = Chi-square distribution
$\mu_{X_i}$ = Input population mean of variable $X_i$
$\mu_Z$ = Output mean
$\rho_{ij}$ = Correlation coefficient between $X_i$ and $X_j$
$\Sigma$ = Covariance matrix
$\sigma_{X_i}$ = Input population standard deviation of variable $X_i$
$\sigma^2_{X_i}$ = Input population variance of variable $X_i$
$\sigma_{\mu_Z}$ = Standard deviation of the output mean
$\sigma_{\sigma_Z}$ = Standard deviation of the output standard deviation
$\sigma^{ori}_{\mu_Z}$ = Standard deviation of the output mean based on the original data
$\sigma^{opt}_{\mu_Z}$ = Standard deviation of the output mean based on the optimum solution
$\sigma^{ori}_{\sigma_Z}$ = Standard deviation of the output std. dev. based on the original data
$\sigma^{opt}_{\sigma_Z}$ = Standard deviation of the output std. dev. based on the optimum solution
$\sigma_Z$ = Output standard deviation
$\sigma^2_Z$ = Output variance


American  Institute  of  Aeronautics  and  Astronautics 


Approved  for  public  release;  distribution  unlimited 


$\hat{\Sigma}$ = Sample covariance matrix
$\Omega$ = Data set


I.  Introduction 

THE  presence  of  uncertainty  in  risk  and  reliability  analysis  is  unavoidable;  it  is  an  important  part  of  the  planning, 
executing,  and  decision-making  process.  To  develop  estimates,  researchers  must  rely  on  available  data  that  is 
often  limited  and  contains  variability.  Moreover,  they  have  to  rely  on  estimation  or  predictions  based  on  idealized 
models  that  involve  additional  uncertainty  compared  to  reality  [1].  Statistical  estimates  from  simulation,  such  as 
mean  and  standard  deviation  of  the  output  or  the  probability-of-failure  (probability  of  exceeding  a  limit),  often 
involve  significant  uncertainty  caused  by  the  variability  in  the  input  random  variables.  The  probability  distribution 
of  the  input  variables  may  be  developed  from  the  limited  available  data,  thus  the  sample  mean  and  standard 
deviation  of  the  input  are  also  random  variables  dependent  upon  the  sample  size.  The  resulting  uncertainty  in  the 
estimated output moments can be significant and should be taken into account when making decisions
[2][3][8][10][11][13][14].

Previous  research  has  focused  on  the  quantification  of  uncertainty  caused  by  the  variability  of  the  input  random 
variables  using  confidence  intervals  of  the  output  model.  Numerous  authors  have  developed  methods  to  calculate 
these  confidence  intervals.  On  the  other  hand,  less  work  has  been  done  to  reduce  the  uncertainty  of  the  output 
model,  and  a  very  limited  number  of  authors  have  tried  to  increase  the  confidence  of  the  output  model  by  allocating 
resources  to  obtain  more  experimental  data  of  the  input  random  variables. 

Several  methods  have  been  proposed  to  accomplish  the  computation  of  the  confidence  intervals  of  the  statistical 
estimates  (the  output  moments  or  the  probability-of-failure).  Most  authors  have  developed  methods  to  estimate  the 
probability-of-failure using a first-order reliability method (FORM) [10][11][8][14]. FORM estimates the shortest
distance,  known  as  reliability  index,  from  the  origin  of  a  standard  normal  variable  space  to  a  design  point  (most 
probable  point)  on  a  limit  state.  The  limit  state  is  the  boundary  between  the  safe  and  unsafe  region  [11].  The 
uncertainty  present  in  the  distribution  of  the  input  parameters  is  quantified  by  obtaining  the  confidence  intervals  of 
the  reliability  index  (or  safety  index). 

The  methods  to  calculate  the  confidence  intervals  of  the  safety  index  in  reliability  analysis  are  accepted  for  many 
problems  if  the  most  probable  point  can  be  located,  and  the  limit  state  function  (boundary  between  the  safe  and 
unsafe  regions)  can  be  approximated  with  a  surface  of  first-  or  second-order.  A  more  general  method  to  obtain  the 
reliability  or  probability-of-failure  is  using  Monte  Carlo  simulation  (MCS),  and  some  authors  have  done  rigorous 
studies  on  this  matter  [4]  [7]. 

The  most  recognized  strategy  to  determine  the  influence  of  the  input  parameter  variation  on  the  output  model  is 
accomplished  by  nesting  a  loop  of  a  single  output  calculation  within  a  loop  that  accounts  for  the  uncertainty  of  the 
input  parameter.  The  loop  where  the  output  moments  are  computed  is  often  referred  to  as  the  “inner-loop,”  and  the 
loop  where  the  variation  of  the  input  parameter  is  taken  into  account  is  usually  referred  to  as  the  “outer-loop.”  The 
limitation of this method is the high computational cost of the nested loops. Carrying out the
nested simulation can be a non-trivial problem, and several authors have developed computational strategies
to  address  this  issue.  The  common  strategy  involves  using  a  surrogate  model,  such  as  a  response  surface,  to 
approximate  the  probability-of-failure  as  a  function  of  the  input  moments  [2][3].  However,  the  accuracy  of  the 
statistical  estimates  depends  on  the  quality  of  the  surrogate  model. 

Several  authors  have  studied  the  variation  of  statistical  estimates  from  simulation,  such  as  mean  and  standard 
deviation  of  the  output  or  the  probability-of-failure,  in  the  presence  of  uncertainty  in  the  input  parameters.  They 
have  tried  to  quantify  the  variation  and  delimit  the  reliability  with  confidence  intervals.  Most  of  the  authors  have 
agreed  and  developed  strategies  to  address  the  computational  complexity  by  using  surrogate  models  to  replace  the 
inner-loop  in  the  nested  reliability  analysis;  however,  very  limited  research  has  been  done  to  increase  the  confidence 
in  the  output  model  by  taking  any  actions  over  the  input  parameters,  such  as  allocating  resources  to  obtain  more 
experimental  data  of  the  input  random  variables. 

Urbina  et  al.  [13]  implemented  a  hierarchical  approach  to  minimize  the  mean  and  the  range  of  the  probability-of- 
failure  by  allocating  resources  to  obtain  additional  experimental  data  of  the  input  variables.  The  input  parameters 
were  obtained  from  an  empirical  cumulative  distribution  function  developed  from  the  observed  data.  These 
parameters  were  introduced  into  a  Bayesian  network  to  obtain  the  system  response.  The  system  response  was 
compared  to  an  expected  performance  measure  to  calculate  the  probability-of-failure.  A  multi-objective  optimization 
problem  was  solved  using  a  grid  search  approach  and  the  constraint  was  a  function  of  cost  of  the  additional 
experimental  data. 
The main purpose of this research was to develop a methodology to reduce the variation of an output moment
using an optimization algorithm that varied the number of data points obtained for each input variable. Another
purpose was to develop a method to generate realizations of the population mean and population covariance of the
input random variables. The method was required to handle independent and correlated variables. The optimal
allocation methodology aimed to find the optimal additional experimental data needed to better characterize the
moments of the input probability density functions (PDFs) in order to minimize the variance of the output moments,
such as mean and standard deviation, subject to a constraint. The methodology combined a single-objective
optimization algorithm with a nested-loop arrangement. The output moments were calculated analytically for
efficiency; however, the methodology is not limited to such models.

II.  Simulation  of  Statistical  Moments 

The sample mean, $\bar{X}_{X_i}$, and sample variance, $S^2_{X_i}$, become random variables when sampled multiple
times from the same population. A probabilistic distribution of the sample moments may be obtained from these
multiple samples. These distributions are known as sampling distributions [15] and are used to simulate the input
population mean and standard deviation.


A.  Population  mean 

In general, the standard deviation of the population, $\sigma_{X_i}$, is unknown and needs to be estimated with the
sample standard deviation, $S_{X_i}$. As a result, the random variable
$(\bar{X}_{X_i} - \mu_{X_i}) / (S_{X_i}/\sqrt{n_{X_i}})$ follows a t-distribution as

$$\frac{\bar{X}_{X_i} - \mu_{X_i}}{S_{X_i}/\sqrt{n_{X_i}}} \sim t_{n_{X_i}-1} \tag{1}$$


where $t_{n_{X_i}-1}$ is the Student's t-distribution with $n_{X_i}-1$ degrees of freedom [1]. Consequently,
realizations of the population mean, $\mu_{X_i}$, can be determined as

$$\mu_{X_i} = \bar{X}_{X_i} - t_{n_{X_i}-1}\,\frac{S_{X_i}}{\sqrt{n_{X_i}}} \tag{2}$$
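As an illustration of Eq. (2), the following is a minimal Python sketch that draws realizations of the population mean of a single variable. The function name `sample_population_mean`, the synthetic observations, and the use of NumPy and SciPy are assumptions made for this example and are not part of the original study.

```python
import numpy as np
from scipy import stats


def sample_population_mean(data, n_draws, rng):
    """Draw realizations of the population mean following Eq. (2)."""
    data = np.asarray(data, dtype=float)
    n = data.size
    x_bar = data.mean()        # sample mean
    s = data.std(ddof=1)       # sample standard deviation (divisor n - 1)
    # Realizations of the Student's t-distribution with n - 1 degrees of freedom
    t = stats.t.rvs(df=n - 1, size=n_draws, random_state=rng)
    return x_bar - t * s / np.sqrt(n)


rng = np.random.default_rng(0)
# Synthetic observations used only for illustration (not data from the paper)
observations = rng.normal(loc=10.0, scale=2.0, size=20)
mu_draws = sample_population_mean(observations, 10_000, rng)
```

The spread of `mu_draws` shrinks as the number of observations grows, which is the effect the allocation methodology exploits.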


In the case of correlated variables, the procedure is to generate realizations of the multivariate t-distribution and
compute the population mean, $\mu_{X_i}$, as follows

$$\mu_{X_i} = \bar{X}_{X_i} - t_{p_i,\,n_{X_i}-1} \tag{3}$$

where $\bar{X}_{X_i}$ is the sample mean of variable $X_i$, $S_{X_i}$ is the sample standard deviation, $n_{X_i}$ is the
number of observations, and $t_{p_i,\,n_{X_i}-1}$ is the $i$th realization from the $p$-variate t-distribution with
$n_{X_i}-1$ degrees of freedom, location vector zero and scale matrix $\hat{\Sigma}$. In this approach, it is assumed
that the number of data points is the same for all random variables that are correlated. The sample covariance,
$\hat{\Sigma}$, is a $p \times p$ matrix calculated as follows

$$\hat{\Sigma} = \frac{1}{n_{X_i}-1}\sum_{j=1}^{n_{X_i}} \left(X^{(j)} - \bar{X}\right)\left(X^{(j)} - \bar{X}\right)^{T} \tag{4}$$

where $X_i^{(j)}$ is the $j$th observation of the $X_i$ input variable with $i = 1, \dots, p$ and
$j = 1, \dots, n_{X_i}$. The sample covariance, $\hat{\Sigma}$, is the unbiased estimator of the covariance, $\Sigma$.
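The correlated case in Eq. (3) and Eq. (4) might be sketched in Python as follows. This is only an illustration under stated assumptions: the t realizations are drawn with scale matrix $\hat{\Sigma}/n$, chosen here so that the spread matches the $S_{X_i}/\sqrt{n_{X_i}}$ factor of Eq. (2); the function name `sample_correlated_means` and the synthetic data are hypothetical. If a different scaling is intended, only the `shape` argument changes.

```python
import numpy as np
from scipy import stats


def sample_correlated_means(data, n_draws, rng):
    """Draw realizations of the population mean vector for correlated variables."""
    data = np.asarray(data, dtype=float)      # shape (n, p): n observations of p variables
    n, p = data.shape
    x_bar = data.mean(axis=0)                 # vector of sample means
    sigma_hat = np.cov(data, rowvar=False)    # Eq. (4), unbiased (divisor n - 1)
    # Assumption: zero location and scale matrix sigma_hat / n
    t = stats.multivariate_t.rvs(
        loc=np.zeros(p), shape=sigma_hat / n, df=n - 1,
        size=n_draws, random_state=rng,
    )
    return x_bar - t                          # Eq. (3), one row per realization


rng = np.random.default_rng(0)
# Synthetic correlated sample for illustration only (17 observations, 2 variables)
true_cov = np.array([[4.41e-4, -9.45e-4], [-9.45e-4, 1.44e-2]])
samples = rng.multivariate_normal([0.302, 1.96], true_cov, size=17)
mu_vec_draws = sample_correlated_means(samples, 5_000, rng)
```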

B.  Population  Variance  and  Covariance  Matrix 

Similarly, using the probability distribution of the sample variance, $S^2_{X_i}$, the random variable
$(n_{X_i}-1)S^2_{X_i}/\sigma^2_{X_i}$ has the following distribution

$$\frac{(n_{X_i}-1)\,S^2_{X_i}}{\sigma^2_{X_i}} \sim \chi^2_{n_{X_i}-1} \tag{5}$$

where $\chi^2_{n_{X_i}-1}$ is the chi-square distribution with $n_{X_i}-1$ degrees of freedom [1]. As a consequence,
realizations of the population variance, $\sigma^2_{X_i}$, can be determined as follows

$$\sigma^2_{X_i} = \frac{(n_{X_i}-1)\,S^2_{X_i}}{\chi^2_{n_{X_i}-1}} \tag{6}$$
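Eq. (6) can be illustrated with a short Python sketch. The function name `sample_population_variance` and the synthetic observations are assumptions for this example only.

```python
import numpy as np
from scipy import stats


def sample_population_variance(data, n_draws, rng):
    """Draw realizations of the population variance following Eq. (6)."""
    data = np.asarray(data, dtype=float)
    n = data.size
    s2 = data.var(ddof=1)     # sample variance (divisor n - 1)
    # Realizations of the chi-square distribution with n - 1 degrees of freedom
    chi2 = stats.chi2.rvs(df=n - 1, size=n_draws, random_state=rng)
    return (n - 1) * s2 / chi2


rng = np.random.default_rng(0)
# Synthetic observations used only for illustration (not data from the paper)
observations = rng.normal(loc=10.0, scale=2.0, size=20)
sigma2_draws = sample_population_variance(observations, 10_000, rng)
```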


In the case of correlated variables, the input population covariance matrix was obtained using a Bayesian approach.
A commonly used prior distribution is

$$p(\mu, \Sigma) = \frac{1}{|\Sigma|^{(p+1)/2}} \tag{7}$$

where $|\Sigma|$ is the determinant of the population covariance matrix $\Sigma$. Considering
$\Omega = (X^{(1)}, X^{(2)}, \dots, X^{(n)})$ to be the observed data, the likelihood $p(\Omega \mid \mu, \Sigma)$ and
the prior distribution $p(\mu, \Sigma)$ determine the joint distribution $p(\Omega, \mu, \Sigma)$. From this, the
posterior distribution of the unknown parameters, $p(\mu, \Sigma \mid \Omega)$, can be obtained. The Bayesian
approach prescribes the use of the posterior distribution to make inference about unknown parameters, so in
particular it can be used to simulate values for the unknown means and covariance. From the above models, the
inverse of the population covariance matrix, $\Sigma^{-1}$, conditioned on the data $\Omega$ has the following
distribution


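$$\left(\Sigma^{-1} \mid \Omega\right) \sim W_p\!\left(\left((n_{X_i}-1)\hat{\Sigma}\right)^{-1},\; n_{X_i}-1\right) \tag{8}$$

where $W_p\!\left(\left((n_{X_i}-1)\hat{\Sigma}\right)^{-1}, n_{X_i}-1\right)$ represents the Wishart distribution with
$n_{X_i}-1$ degrees of freedom and scale matrix $\left((n_{X_i}-1)\hat{\Sigma}\right)^{-1}$. To obtain the population
covariance matrix, $\Sigma$, it is necessary to sample from the Wishart distribution given by Eq. (8) and invert the
values, thus yielding

$$\Sigma^{-1} = W_p\!\left(\left((n_{X_i}-1)\hat{\Sigma}\right)^{-1},\; n_{X_i}-1\right) \tag{9}$$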
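A possible Python sketch of Eq. (8) and Eq. (9) is shown below: the inverse covariance is sampled from a Wishart distribution and each draw is inverted. The function name `sample_population_covariance` and the synthetic data are assumptions for illustration; the sketch also assumes more observations than variables so that the Wishart degrees of freedom are admissible.

```python
import numpy as np
from scipy import stats


def sample_population_covariance(data, n_draws, rng):
    """Draw realizations of the population covariance matrix via Eq. (8) and Eq. (9)."""
    data = np.asarray(data, dtype=float)             # shape (n, p)
    n, p = data.shape
    sigma_hat = np.cov(data, rowvar=False)           # sample covariance, Eq. (4)
    scale = np.linalg.inv((n - 1) * sigma_hat)       # Wishart scale matrix
    # Realizations of the inverse covariance (precision) matrix, Eq. (8)
    precisions = stats.wishart.rvs(df=n - 1, scale=scale, size=n_draws, random_state=rng)
    # Invert each draw to obtain covariance realizations, Eq. (9)
    return np.linalg.inv(precisions)


rng = np.random.default_rng(0)
# Synthetic correlated sample for illustration only (17 observations, 2 variables)
true_cov = np.array([[4.41e-4, -9.45e-4], [-9.45e-4, 1.44e-2]])
samples = rng.multivariate_normal([0.302, 1.96], true_cov, size=17)
cov_draws = sample_population_covariance(samples, 2_000, rng)
```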


III.  Optimal  Allocation  of  Resources 

Different approaches have been studied for quantifying the uncertainty in the statistical estimates, such as mean
and standard deviation of the output or the probability-of-failure, caused by the variation of the input random
variables. To date, there has been little development on how to reduce the variation of the output moment
distribution by taking action over the input variables. The action considered in this work was to add data
or experiments to the input variables to better characterize the mean and standard deviation of the input probability
density functions (PDFs). The methodology aims to determine the optimal number of experiments required to
minimize the variance of the output moments given a constraint. Equivalently, the methodology identifies which
experiments should be conducted in order to improve the confidence in the output moments of a probabilistic
problem.


A.  Methodology 

A  schematic  of  the  computational  approach  is  shown  in  Figure  1.  The  methodology  proposed  is  general  and  can 
be  applied  in  any  field  where  a  reduction  of  the  variance  of  a  statistical  estimate  is  required. 


5 Personal communication with Dr. Victor De Oliveira, Associate Professor at the Department of Management Science and
Statistics at the University of Texas at San Antonio, Texas
Figure 1. Schematic flow chart of optimal allocation methodology


In practice, the distribution of the output moments is often computed using Monte Carlo
sampling. The sampling requires an iterative repetition of actions called a “loop.” This loop is often nested within
another loop; therefore, it is called the “inner-loop.” The inner-loop generates realizations of the output moments,
such as mean or standard deviation. The “outer-loop” determines the distribution and standard deviation of the
output moments. The approach considered in this research was to use an optimization model combined with the
nested-loop arrangement to minimize the standard deviation of the output moments.

Every computation of the nested-loop is an iteration of the optimization process. In every iteration, random
numbers of additional experimental data are tested; the outcome of each iteration is the lowest value of the standard
deviation of the output moment. The optimization process is repeated until the specified number of iterations is
reached. The final result of the optimization is the number of additional experiments that returned the minimum
value of the standard deviation of the output moment. This method is a single-objective optimization; only the
standard deviation of one output moment, such as output mean or output standard deviation, can be optimized at a time.

The constraint of the optimization model is $\sum_{i=1}^{p} C_{X_i} D_{X_i} \le b$, where $C_{X_i}$ is the cost of each
additional experiment, $D_{X_i}$ is the number of additional experiments of variable $X_i$, and $b$ is the total funds
available. The statistical process to minimize the standard deviation of the output moment subject to the constraint
is explained as follows:


1. Initial data of input variable, $X_{E_i}$, are provided

2. Additional experimental data, $X_{D_i}$, are generated

3. The input sample mean, $\bar{X}_{X_i}$, and input sample covariance, $\hat{\Sigma}$, are calculated

4. Realizations of the $p$-variate t-distribution, $t_p$, and Wishart distribution, $W_p$, are used to simulate the
population mean, $\mu_{X_i}$, and population covariance, $\Sigma$, as shown in Eq. (3) and Eq. (9), respectively
(“outer-loop”)

5. According to the objective, the output mean, $\mu_Z$, or output standard deviation, $\sigma_Z$, is calculated
(“inner-loop”)

6. Steps 3 and 4 are repeated to generate a distribution of the output moment

7. The standard deviation of the output moment is calculated

8. The optimization algorithm varies the number of additional experimental data (Step 2) and determines the
optimal number of additional experiments, $D_{X_i}$
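The steps above can be sketched for a simplified case. The Python example below is a hedged illustration, not the study's implementation: it assumes independent inputs (so Eq. (2) is used instead of Eq. (3)), a linear output $Z = A_0 + \sum_i A_i X_i$ whose mean is available analytically, and additional data simulated from a normal distribution fitted to the initial data. The function name `std_of_output_mean` and all numerical values are hypothetical.

```python
import numpy as np
from scipy import stats


def std_of_output_mean(initial, n_extra, coeffs, n_outer, rng):
    """Estimate the standard deviation of the output mean for a linear output function."""
    a0, a = coeffs[0], np.asarray(coeffs[1:], dtype=float)
    mu_z = np.full(n_outer, a0, dtype=float)
    for a_i, x_e, d in zip(a, initial, n_extra):
        x_e = np.asarray(x_e, dtype=float)                  # Step 1: initial data
        # Step 2 (assumption): simulate additional data from a normal fit to the initial data
        x_d = rng.normal(x_e.mean(), x_e.std(ddof=1), size=d)
        x = np.concatenate([x_e, x_d])
        n = x.size
        x_bar, s = x.mean(), x.std(ddof=1)                  # Step 3: sample moments
        # Step 4: realizations of the population mean, Eq. (2)
        t = stats.t.rvs(df=n - 1, size=n_outer, random_state=rng)
        mu_x = x_bar - t * s / np.sqrt(n)
        mu_z += a_i * mu_x                                  # Step 5: analytic output mean
    return mu_z.std(ddof=1)                                 # Step 7: spread of the output mean


rng = np.random.default_rng(1)
# Hypothetical initial data and coefficients, for illustration only
initial = [rng.normal(5.0, 1.0, size=10), rng.normal(2.0, 0.5, size=10)]
coeffs = [1.0, 2.0, -3.0]
sigma_before = std_of_output_mean(initial, (0, 0), coeffs, 5_000, rng)
sigma_after = std_of_output_mean(initial, (90, 90), coeffs, 5_000, rng)
```

In Step 8, an optimizer would call such a function repeatedly with different candidate values of `n_extra`.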

B.  Optimization  Method 

The optimization problem of determining the subsequent experiments needed to reduce the variance of the
output moment subject to a cost constraint is formulated as follows:

Objective: Minimize $(\sigma_{\mu_Z}, \sigma_{\sigma_Z})$

Constraint: $\sum_{i=1}^{p} C_{X_i} D_{X_i} \le b$

Variable Bounds: $D_{X_i}^{lower} \le D_{X_i} \le D_{X_i}^{upper}$

where $\sigma_{\mu_Z}$ is the standard deviation of the output mean, $\sigma_{\sigma_Z}$ is the standard deviation of
the output standard deviation, $C_{X_i}$ is the cost of each additional experiment, $D_{X_i}$ is the number of
additional experiments, $b$ is the funds available, and $D_{X_i}^{lower}$ and $D_{X_i}^{upper}$ are the lower and upper
bounds of the additional experiments $D_{X_i}$, respectively. $D_{X_i}^{upper}$ is obtained by dividing the total funds
available by the cost of each additional experiment, and $D_{X_i}^{lower}$ is zero. This optimization is
single-objective; therefore, only the standard deviation of the output mean, $\sigma_{\mu_Z}$, or the standard
deviation of the output standard deviation, $\sigma_{\sigma_Z}$, is minimized.

The optimization of a non-linear function of integer variables and the high computational cost associated with a
function evaluation suggest that a population-based approach is suitable to solve the problem. Particle swarm
optimization (PSO) was selected because of its ease of implementation and its small number of user parameters.

Particle swarm optimization (PSO) is a population-based method used in the optimization of non-linear
functions; it was proposed in 1995 by Kennedy and Eberhart [6]. PSO is a swarm intelligence method that models
social behavior of a population (swarm) of agents (particles) interacting to find a simulated target on a search space.
In the particle swarm optimization process, the velocity and position of each particle are iteratively adjusted as
shown in Eq. (10) and Eq. (11), respectively

$$v_i^{k+1} = w\,v_i^{k} + q_1 r_{i1}\left(pbestx_i - x_i^{k}\right) + q_2 r_{i2}\left(gbestx - x_i^{k}\right) \tag{10}$$

$$x_i^{k+1} = x_i^{k} + v_i^{k+1} \tag{11}$$

The velocity is defined as a change in magnitude of the design variable from one iteration to another, and the
position is the value of the design variable, which in this research is the number of additional experimental data,
$D_{X_i}$.

The particles move according to a communication structure thought of as a social network. At iteration $k$ the
velocity of the $i$th particle, $v_i^k$, is updated according to its own current velocity value, the best position
encountered by the $i$th particle in all previous iterations (particle best, $pbestx_i$), the best position encountered
by any particle so far (global best, $gbestx$), and the inertia weight, $w$, that controls the impact of the previous
velocity. The particles are attracted toward the positions of $pbestx_i$ and $gbestx$; the strength of the attraction
is controlled by $q_1$ (individual weight) and $q_2$ (social weight). Randomness is introduced for good space
exploration via $r_{i1}$ and $r_{i2}$, which are random numbers from a uniform distribution on the interval between 0
and 1. The position of the particle is updated using its current position value, $x_i^k$, and the newly computed
velocity, $v_i^{k+1}$ [12]. The constants $w$, $q_1$, and $q_2$ are empirical. Trelea et al. [12] have conducted several
experiments with different combinations of these constants recommended by other authors and concluded that the best
results published are

Inertia weight ($w$): 0.729
Cognitive constant ($q_1$): 1.494
Social constant ($q_2$): 1.494
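The update rules of Eq. (10) and Eq. (11) with the constants listed above might be sketched as follows. This is an illustrative sketch rather than the implementation used in the study: rounding positions to integers before evaluating the objective, clipping positions to the variable bounds, and the function name `pso_minimize` are all assumptions, and the quadratic test objective is hypothetical.

```python
import numpy as np


def pso_minimize(objective, lower, upper, n_particles=40, n_iterations=40,
                 w=0.729, q1=1.494, q2=1.494, seed=0):
    """Minimize an objective of integer design variables with a basic PSO."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    x = rng.uniform(lower, upper, size=(n_particles, lower.size))   # positions
    v = np.zeros_like(x)                                            # velocities

    def evaluate(positions):
        # Assumption: positions are rounded to integers before evaluation
        return np.array([objective(np.rint(p).astype(int)) for p in positions])

    f = evaluate(x)
    pbest_x, pbest_f = x.copy(), f.copy()
    g = np.argmin(pbest_f)
    gbest_x, gbest_f = pbest_x[g].copy(), pbest_f[g]
    for _ in range(n_iterations):
        r1 = rng.uniform(size=(n_particles, 1))    # one random number per particle
        r2 = rng.uniform(size=(n_particles, 1))
        v = w * v + q1 * r1 * (pbest_x - x) + q2 * r2 * (gbest_x - x)   # Eq. (10)
        x = np.clip(x + v, lower, upper)                                # Eq. (11), kept in bounds
        f = evaluate(x)
        improved = f < pbest_f
        pbest_x[improved] = x[improved]
        pbest_f[improved] = f[improved]
        g = np.argmin(pbest_f)
        gbest_x, gbest_f = pbest_x[g].copy(), pbest_f[g]
    return np.rint(gbest_x).astype(int), gbest_f


# Hypothetical objective with its minimum at (3, 7), for illustration only
best_d, best_f = pso_minimize(lambda d: (d[0] - 3) ** 2 + (d[1] - 7) ** 2, [0, 0], [20, 20])
```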
The number of particles and the number of iterations were selected according to the complexity of the problem.
The  constraint  used  in  the  method  was  controlled  by  using  a  penalty  function,  where  a  significant  penalty  was 
assigned  to  the  objective  function  value  if  the  constraint  was  exceeded.  This  strategy  forced  the  particles  to  move 
towards  the  feasible  design  space  where  the  target  was  located. 
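A penalty of this kind could look like the Python sketch below. The additive form, the penalty magnitude, and the function name `penalized_objective` are assumptions; the group costs and the $20,000 budget are the values reported for the fretting fatigue case study.

```python
import numpy as np


def penalized_objective(sigma, n_extra, costs, budget, penalty=1.0e6):
    """Return the objective value, adding a large penalty if the cost constraint is exceeded."""
    total_cost = float(np.dot(costs, n_extra))
    if total_cost > budget:
        return sigma + penalty     # infeasible: push particles back toward the feasible space
    return sigma                   # feasible: objective is unchanged


group_costs = [846, 4810, 4748, 919]     # cost per additional experiment for groups G1 to G4
feasible = penalized_objective(0.031, [0, 4, 0, 0], group_costs, 20_000)
infeasible = penalized_objective(0.030, [0, 5, 0, 0], group_costs, 20_000)
```

Here four experiments in group G2 cost $19,240 and are feasible, while five cost $24,050 and are penalized.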

IV.  Fretting  Fatigue 

The optimal allocation methodology was applied to a fretting fatigue problem. Fretting is the wear damage
caused when two materials are pressed against one another in the presence of oscillatory displacements. The wear
and high local stresses cause nucleation of cracks that reduce the fatigue life. Fretting fatigue is a major problem in
the aerospace industry. The damage that occurs from fretting fatigue causes structural failure that may be very
costly. Previous research on fretting fatigue has been done by Golden et al. [5], who performed a probabilistic
fretting fatigue life prediction analysis of Ti-6Al-4V dovetail specimens.

The statistical data given in reference [5] were used in this case study. The fretting fatigue problem consisted of
20 random variables with mean, standard deviation, and correlation values determined from experimental data. The
statistics of the random variables are shown in Table 1.


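Table 1. Random Variables Statistics

Random Variable      Variable No.   Mean, $\mu_{X_i}$   St. dev., $\sigma_{X_i}$   Distribution Type
Initial Crack        $X_1$          15.1                8.48                       Lognormal
Friction Coeff.      $X_2$          0.302               0.021                      Correlated Normal, $\rho_{23} = -0.375$
Partial Slip Slope   $X_3$          1.96                0.12                       Correlated Normal, $\rho_{23} = -0.375$
Crack Growth         $X_4$          -14.6               0.486                      Correlated Normal, $\rho_{45} = -0.9973$
Crack Growth         $X_5$          7.19                0.715                      Correlated Normal, $\rho_{45} = -0.9973$
Crack Growth         $X_6$          -11.8               0.157                      Correlated Normal, $\rho_{67} = -0.9751$
Crack Growth         $X_7$          3.81                0.146                      Correlated Normal, $\rho_{67} = -0.9751$
Pad Profile          $X_8$          0.181               5.84E-03                   Correlated Normal, $\rho$ (see Table 2)
Pad Profile          $X_9$          -2335               410                        Correlated Normal, $\rho$ (see Table 2)
Pad Profile          $X_{10}$       2333                411                        Correlated Normal, $\rho$ (see Table 2)
Pad Profile          $X_{11}$       -1612               37.7                       Correlated Normal, $\rho$ (see Table 2)
Pad Profile          $X_{12}$       2289                379                        Correlated Normal, $\rho$ (see Table 2)
Pad Profile          $X_{13}$       1620                35.4                       Correlated Normal, $\rho$ (see Table 2)
Pad Profile          $X_{14}$       -0.183              4.96E-03                   Correlated Normal, $\rho$ (see Table 2)
Pad Profile          $X_{15}$       -2.00E-04           6.20E-04                   Correlated Normal, $\rho$ (see Table 2)
Pad Profile          $X_{16}$       -1.21E-06           1.01E-06                   Correlated Normal, $\rho$ (see Table 2)
Pad Profile          $X_{17}$       1.53E-10            6.38E-10                   Correlated Normal, $\rho$ (see Table 2)
Pad Profile          $X_{18}$       9.80E-13            6.16E-13                   Correlated Normal, $\rho$ (see Table 2)
Pad Profile          $X_{19}$       -3.77E-17           1.87E-16                   Correlated Normal, $\rho$ (see Table 2)
Pad Profile          $X_{20}$       -3.80E-19           1.59E-19                   Correlated Normal, $\rho$ (see Table 2)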
Table 2. Pad Profile Correlation Coefficients

              $X_8$    $X_9$ $X_{10}$ $X_{11}$ $X_{12}$ $X_{13}$ $X_{14}$ $X_{15}$ $X_{16}$ $X_{17}$ $X_{18}$ $X_{19}$ $X_{20}$
$X_8$             1   -0.079    0.078   -0.119   -0.051    0.184   -0.364    0.153    0.175   -0.107   -0.130    0.073    0.112
$X_9$        -0.079        1   -1.000   -0.921   -0.404    0.321   -0.122    0.093   -0.205   -0.203    0.281    0.275   -0.246
$X_{10}$      0.078   -1.000        1    0.921    0.404   -0.321    0.121   -0.092    0.207    0.203   -0.282   -0.275    0.247
$X_{11}$     -0.119   -0.921    0.921        1    0.325   -0.372    0.270   -0.019    0.104    0.136   -0.201   -0.247    0.219
$X_{12}$     -0.051   -0.404    0.404    0.325        1   -0.899   -0.204   -0.102    0.079    0.065   -0.116    0.002    0.123
$X_{13}$      0.184    0.321   -0.321   -0.372   -0.899        1   -0.033    0.010   -0.004    0.062    0.021   -0.147   -0.061
$X_{14}$     -0.364   -0.122    0.121    0.270   -0.204   -0.033        1   -0.095   -0.242   -0.017    0.142    0.078   -0.133
$X_{15}$      0.153    0.093   -0.092   -0.019   -0.102    0.010   -0.095        1    0.140   -0.876   -0.027    0.627    0.059
$X_{16}$      0.175   -0.205    0.207    0.104    0.079   -0.004   -0.242    0.140        1   -0.099   -0.869    0.057    0.765
$X_{17}$     -0.107   -0.203    0.203    0.136    0.065    0.062   -0.017   -0.876   -0.099        1    0.011   -0.915   -0.027
$X_{18}$     -0.130    0.281   -0.282   -0.201   -0.116    0.021    0.142   -0.027   -0.869    0.011        1    0.027   -0.944
$X_{19}$      0.073    0.275   -0.275   -0.247    0.002   -0.147    0.078    0.627    0.057   -0.915    0.027        1   -0.031
$X_{20}$      0.112   -0.246    0.247    0.219    0.123   -0.061   -0.133    0.059    0.765   -0.027   -0.944   -0.031        1

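Tables 1 and 2 report standard deviations and correlation coefficients, whereas the sampling procedure works with covariance matrices. The conversion $\Sigma_{ij} = \rho_{ij}\,\sigma_{X_i}\sigma_{X_j}$ can be illustrated with a short Python sketch for the friction coefficient and partial slip slope pair ($X_2$, $X_3$). The function name `covariance_from_correlation` is an assumption for this example.

```python
import numpy as np


def covariance_from_correlation(std, corr):
    """Build a covariance matrix from standard deviations and a correlation matrix."""
    std = np.asarray(std, dtype=float)
    corr = np.asarray(corr, dtype=float)
    return corr * np.outer(std, std)      # element (i, j) is rho_ij * sigma_i * sigma_j


std_g2 = [0.021, 0.12]                           # standard deviations of X2 and X3 (Table 1)
corr_g2 = [[1.0, -0.375], [-0.375, 1.0]]         # rho_23 = -0.375 (Table 1)
cov_g2 = covariance_from_correlation(std_g2, corr_g2)
```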

Linear regression was used to fit a predictive model of the form $\log(N_f) = A_0 + \sum_i A_i X_i$ to a set of 10,000
data points$^6$, where $N_f$ is cycles-to-failure. The linear regression coefficients $A_0$ and $A_i$ are shown in
Table 3. The coefficient of determination is $R^2 = 0.87$; thus, about 87% of the variation of $\log(N_f)$ is explained
by the predictor variables in the model.

Table 3. Linear Regression Coefficients

Term                Estimate, $A_i$
Intercept ($A_0$)   4.93
$X_1$               -4.10E+03
$X_2$               -7.48
$X_3$               2.52E-03
$X_4$               -0.18
$X_5$               -0.14
$X_6$               -0.48
$X_7$               -0.44
$X_8$               0.06
$X_9$               1.08E-04
$X_{10}$            1.05E-04
$X_{11}$            -1.95E-04
$X_{12}$            -5.60E-05
$X_{13}$            -9.81E-04
$X_{14}$            0.16
$X_{15}$            8.47
$X_{16}$            1.51E+05
$X_{17}$            1.05E+08
$X_{18}$            4.64E+11
$X_{19}$            3.51E+14
$X_{20}$            1.26E+18

6 Provided by Dr. Patrick J. Golden from the Materials and Manufacturing Directorate, Air Force Research Laboratory,
Wright-Patterson AFB, OH

Once the predictive model, $\log(N_f) = A_0 + A_1X_1 + A_2X_2 + \dots + A_{20}X_{20}$, was validated, it was used to
analytically obtain the moments of $\log(N_f)$. The mean and standard deviation of $\log(N_f)$ were calculated as
shown in Eq. (12) and Eq. (13), respectively

$$\mu_{\log(N_f)} = A_0 + A_1\mu_{X_1} + A_2\mu_{X_2} + \dots + A_{20}\mu_{X_{20}} \tag{12}$$

and

$$\sigma_{\log(N_f)} = \sqrt{\sum_{i=1}^{20}\sum_{j=1}^{20} A_i A_j \Sigma_{ij}} \tag{13}$$

where $\mu_{X_i}$ represents the mean of variable $X_i$ and $\Sigma_{ij}$ represents the $(i,j)$th value of the
population covariance matrix, $\Sigma$.

Finally, the objective was to determine how many additional experiments were needed to minimize the standard
deviation of the $\log(N_f)$ moments, $(\sigma_{\mu_{\log(N_f)}}, \sigma_{\sigma_{\log(N_f)}})$, given the funds
available.

The random variables were partitioned into four groups according to the correlation of the random variables. The
group's distribution, cost, and initial data are shown in Table 4. The number of additional experiments required to
minimize the standard deviation of $\log(N_f)$ was determined by group.
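Eq. (12) and Eq. (13) reduce to a dot product and a quadratic form. The Python sketch below evaluates them for a two-variable subset ($X_2$ and $X_3$ only), using the intercept and coefficients from Table 3 and the statistics from Table 1. The subset is chosen purely to keep the example small and does not reproduce the moments of the full 20-variable model; the function name `linear_output_moments` is an assumption.

```python
import numpy as np


def linear_output_moments(a0, a, mu, cov):
    """Mean and standard deviation of a0 + sum(a_i * X_i) for given input mean and covariance."""
    a = np.asarray(a, dtype=float)
    mean = a0 + a @ np.asarray(mu, dtype=float)            # form of Eq. (12)
    std = np.sqrt(a @ np.asarray(cov, dtype=float) @ a)    # form of Eq. (13)
    return mean, std


# Illustrative two-variable subset (X2, X3); not the full model of the paper
a0 = 4.93
a_sub = [-7.48, 2.52e-3]
mu_sub = [0.302, 1.96]
cov_sub = [[0.021 ** 2, -0.375 * 0.021 * 0.12],
           [-0.375 * 0.021 * 0.12, 0.12 ** 2]]
mean_sub, std_sub = linear_output_moments(a0, a_sub, mu_sub, cov_sub)
```

Within the nested loop, `mu` and `cov` would be replaced by the realizations of the population mean and covariance.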


Table 4. Grouping of Random Variables

Group   Random Variable                       No.              Distribution Type                                                  Test Cost, $C_{G_i}$   Initial Data, $E_{G_i}$
$G_1$   Initial Crack                         $X_1$            Lognormal                                                          $846                   20
$G_2$   Friction Coeff./Partial Slip Slope    $X_2$-$X_3$      Correlated Normal, $\rho_{23} = -0.375$                            $4,810                 17
$G_3$   Crack Growth                          $X_4$-$X_7$      Correlated Normal, $\rho_{45} = -0.9973$, $\rho_{67} = -0.9751$    $4,748                 198
$G_4$   Pad Profile                           $X_8$-$X_{20}$   Correlated Normal, $\rho$ (see Table 2)                            $919                   77


Case C-1. Minimize $\sigma_{\mu_{\log(N_f)}}$

A preliminary study was performed before utilizing the optimization methodology, in which the total funds
available, $b$ = $20,000, were allocated only for one group at a time. The maximum number of additional
experiments for each group was calculated as $D_{G_i} = \lfloor b / C_{G_i} \rfloor$. The mean of $\log(N_f)$ was given
as $\mu_{\log(N_f)} = A_0 + A_1\mu_{X_1} + \dots + A_{20}\mu_{X_{20}}$, where $\mu_{X_i}$ was calculated using Eq. (3).
The results of the reduction of $\sigma_{\mu_{\log(N_f)}}$ are summarized in Table 5.

7 Provided by Dr. Patrick J. Golden from the Materials and Manufacturing Directorate, Air Force Research Laboratory,
Wright-Patterson AFB, OH


Table 5. Reduction if total funds were spent on each group

                                                              Test 1     Test 2     Test 3     Test 4
$D_{G_1}$                                                     23         0          0          0
$D_{G_2}$                                                     0          4          0          0
$D_{G_3}$                                                     0          0          4          0
$D_{G_4}$                                                     0          0          0          21
$\sigma_{\mu_{\log(N_f)}}$ (original data)                    0.0366     0.0366     0.0366     0.0366
$\sigma_{\mu_{\log(N_f)}}$ (after additional data)            0.0354     0.0311     0.0365     0.0362
% Reduction                                                   3%         15%        0.3%       1%
$\sum C_{G_i} D_{G_i}$                                        $19,458    $19,240    $18,992    $19,299
Each experimental data of $G_i$ reduced
$\sigma_{\mu_{\log(N_f)}}$ by                                 0.13%      3.75%      0.08%      0.05%

Group \(G_2\) had the highest reduction in \(\sigma_{\mu_{\log(N_f)}}\) (15%), followed by Groups \(G_1\) (3%), \(G_4\) (1%), and \(G_3\) (0.3%). If one additional experiment was added to Group \(G_2\), \(G_1\), \(G_3\), or \(G_4\), the reduction would be 3.75%, 0.13%, 0.08%, and 0.05%, respectively.
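The percent reductions and the per-experiment figures follow directly from the tabulated standard deviations. The sketch below recomputes them from the four-decimal values in Table 5; because those inputs are already rounded, the per-experiment results agree with the table only to within rounding (for example, about 0.14% rather than 0.13% for \(G_1\)). The function and variable names are illustrative.

```python
# Values transcribed from Table 5 (already rounded to four decimals).
sigma_orig = 0.0366
sigma_new = {"G1": 0.0354, "G2": 0.0311, "G3": 0.0365, "G4": 0.0362}
added = {"G1": 23, "G2": 4, "G3": 4, "G4": 21}


def percent_reduction(before, after):
    """Relative decrease of `after` with respect to `before`, in percent."""
    return 100.0 * (before - after) / before


# Total reduction when the whole budget goes to one group.
total = {g: percent_reduction(sigma_orig, s) for g, s in sigma_new.items()}

# Average reduction contributed by each added experiment.
per_test = {g: total[g] / added[g] for g in total}

# Groups ordered from most to least effective per experiment.
ranking = sorted(per_test, key=per_test.get, reverse=True)

for g in ranking:
    print(f"{g}: total {total[g]:.2f}%, per experiment {per_test[g]:.3f}%")
```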

Next, the optimal allocation methodology was applied with 40 particles and 40 iterations. Table 6 summarizes the results after running the analysis four separate times.


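Table 6. Results (Case C-1)

| Analysis | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| \(D_{G_1}\) | 0 | 0 | 0 | 0 |
| \(D_{G_2}\) | 4 | 4 | 4 | 4 |
| \(D_{G_3}\) | 0 | 0 | 0 | 0 |
| \(D_{G_4}\) | 0 | 0 | 0 | 0 |
| \(\sigma^{orig}_{\mu_{\log(N_f)}}\) | 0.036 | 0.036 | 0.036 | 0.036 |
| \(\sigma^{opt}_{\mu_{\log(N_f)}}\) | 0.031 | 0.031 | 0.031 | 0.031 |
| % Reduction | 14% | 14% | 14% | 14% |
| \(\sum C_{G_i} D_{G_i}\) | $19,240 | $19,240 | $19,240 | $19,240 |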

In all cases, the standard deviation of \(\mu_{\log(N_f)}\) was reduced by approximately 14% by adding 4 experiments to Group \(G_2\). The PDF of \(\mu_{\log(N_f)}\) is shown in Figure 2. The red area with black dashed line represents the PDF of \(\mu_{\log(N_f)}\) before adding any data and the blue area represents the PDF after adding the optimal experimental data.
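To illustrate the kind of search behind these results, the following is a minimal sketch of a particle swarm over integer experiment counts with a budget penalty, using 40 particles and 40 iterations. It is not the authors' implementation: the rounding step, the penalty, the swarm coefficients, and in particular the `WEIGHT` values and the `w / (E + D)` variance proxy are assumptions made here for illustration, whereas the paper evaluates its objective by simulation. Only the costs, initial data counts, and budget come from Table 4 and the text.

```python
import random

COST = [846, 4810, 4748, 919]   # Table 4 test costs for G1..G4
INITIAL = [20, 17, 198, 77]     # Table 4 initial data counts
BUDGET = 20_000                 # funds available, b
# HYPOTHETICAL weights standing in for each group's influence on the output;
# they are placeholders, not values from the paper.
WEIGHT = [0.2, 1.0, 0.3, 0.4]


def total_cost(d):
    return sum(c * n for c, n in zip(COST, d))


def objective(d):
    """Toy variance proxy plus a large penalty when the budget is exceeded."""
    proxy = sum(w / (e + n) for w, e, n in zip(WEIGHT, INITIAL, d))
    return proxy + (1e6 if total_cost(d) > BUDGET else 0.0)


def integer_pso(f, upper, n_particles=40, n_iter=40, seed=0):
    rng = random.Random(seed)
    dim = len(upper)
    pos = [[rng.randint(0, u) for u in upper] for _ in range(n_particles)]
    pos[0] = [0] * dim  # one particle starts at the always-feasible baseline
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_val = [f(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    g_pos, g_val = best_pos[g][:], best_val[g]
    w, c1, c2 = 0.7, 1.5, 1.5  # common textbook settings, assumed here
    for _ in range(n_iter):
        for k in range(n_particles):
            for d in range(dim):
                pull_own = c1 * rng.random() * (best_pos[k][d] - pos[k][d])
                pull_swarm = c2 * rng.random() * (g_pos[d] - pos[k][d])
                vel[k][d] = w * vel[k][d] + pull_own + pull_swarm
                # Round to the nearest integer and clamp to [0, upper[d]].
                pos[k][d] = min(upper[d], max(0, round(pos[k][d] + vel[k][d])))
            val = f(pos[k])
            if val < best_val[k]:
                best_pos[k], best_val[k] = pos[k][:], val
                if val < g_val:
                    g_pos, g_val = pos[k][:], val
    return g_pos, g_val


upper = [BUDGET // c for c in COST]  # single-group bounds, floor(b / C)
best, value = integer_pso(objective, upper)
print(best, total_cost(best), value)
```

Seeding one particle at zero additional experiments guarantees that the returned allocation respects the budget, since every infeasible point scores worse than that baseline. Because the weights are placeholders, the allocation printed here should not be read as a reproduction of Table 6.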
Figure 3 and Table 7 show the behavior of \(\sigma_{\mu_{\log(N_f)}}\) as a function of funds available. An increase in funds available allowed more additional experiments, and with more additional experiments the reduction of \(\sigma_{\mu_{\log(N_f)}}\) was higher. The decrease of \(\sigma_{\mu_{\log(N_f)}}\) as more funds become available is depicted with a black dotted line, and the pink solid line shows the percent reduction of \(\sigma_{\mu_{\log(N_f)}}\) after adding experimental data to the initial data points.


Figure 3. Behavior with respect to funds available (horizontal axis: Funds Available ($))
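Table 7. Results for different funds available

| Funds | $20,000 | $40,000 | $80,000 | $100,000 | $200,000 |
|---|---|---|---|---|---|
| \(D_{G_1}\) | 0 | 1 | 2 | 7 | 8 |
| \(D_{G_2}\) | 4 | 8 | 16 | 18 | 34 |
| \(D_{G_3}\) | 0 | 0 | 0 | 0 | 4 |
| \(D_{G_4}\) | 0 | 0 | 0 | 8 | 11 |
| \(\sigma^{orig}_{\mu_{\log(N_f)}}\) | 0.036 | 0.036 | 0.036 | 0.036 | 0.036 |
| \(\sigma^{opt}_{\mu_{\log(N_f)}}\) | 0.031 | 0.029 | 0.027 | 0.026 | 0.023 |
| % Reduction | 14% | 19% | 26% | 28% | 36% |
| \(\sum C_{G_i} D_{G_i}\) | $19,240 | $39,326 | $78,652 | $99,854 | $199,409 |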
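The cost row of Table 7 can be reproduced from the group test costs in Table 4, which also confirms that each allocation satisfies the budget constraint. The short sketch below does this check; the names are illustrative.

```python
# Test cost per additional experiment for G1..G4 (Table 4).
cost = [846, 4810, 4748, 919]

# Allocations [D_G1, D_G2, D_G3, D_G4] reported in Table 7, keyed by budget.
allocations = {
    20_000: [0, 4, 0, 0],
    40_000: [1, 8, 0, 0],
    80_000: [2, 16, 0, 0],
    100_000: [7, 18, 0, 8],
    200_000: [8, 34, 4, 11],
}


def total_cost(d):
    """Total spend for an allocation: sum of cost times number of tests."""
    return sum(c * n for c, n in zip(cost, d))


spent = {b: total_cost(d) for b, d in allocations.items()}
leftover = {b: b - spent[b] for b in spent}

for b in allocations:
    print(f"budget ${b:,}: spent ${spent[b]:,}, unspent ${leftover[b]:,}")
```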

Case C-2. Minimize \(\sigma_{\sigma_{\log(N_f)}}\)

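In this case, the objective was to minimize the standard deviation of the \(\log(N_f)\) standard deviation, \(\sigma_{\sigma_{\log(N_f)}}\). The standard deviation \(\sigma_{\log(N_f)}\) was calculated as \(\sigma_{\log(N_f)} = \sqrt{\sum_{i} \sum_{j} A_i A_j \Sigma_{ij}}\), where \(\Sigma_{ij}\) denotes the covariance between \(X_i\) and \(X_j\). The analysis was conducted four separate times with 40 particles and 40 iterations. A summary of the results is shown in Table 8.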
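For a linear output function, the standard deviation follows from the coefficients and the input covariance matrix as a double sum. The sketch below shows that calculation on a two-variable example; the coefficients and covariance entries are hypothetical values chosen for easy arithmetic, not data from the fretting fatigue problem.

```python
import math


def linear_output_std(coeffs, cov):
    """Standard deviation of sum_i coeffs[i] * X_i given the covariance of X."""
    n = len(coeffs)
    variance = sum(
        coeffs[i] * coeffs[j] * cov[i][j] for i in range(n) for j in range(n)
    )
    return math.sqrt(variance)


# Hypothetical example with two negatively correlated inputs.
A = [1.0, 2.0]
cov = [[4.0, -1.0],
       [-1.0, 1.0]]
sigma_correlated = linear_output_std(A, cov)

# Same variances with the correlation ignored, for comparison.
cov_independent = [[4.0, 0.0],
                   [0.0, 1.0]]
sigma_independent = linear_output_std(A, cov_independent)

print(sigma_correlated, sigma_independent)
```

In this example the negative covariance lowers the output standard deviation relative to the independent case, which is why the off-diagonal terms cannot be dropped for correlated groups.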

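Table 8. Results (Case C-2)

| Analysis | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| \(D_{G_1}\) | 0 | 0 | 0 | 0 |
| \(D_{G_2}\) | 4 | 4 | 4 | 4 |
| \(D_{G_3}\) | 0 | 0 | 0 | 0 |
| \(D_{G_4}\) | 0 | 0 | 0 | 0 |
| \(\sigma^{orig}_{\sigma_{\log(N_f)}}\) | 0.023 | 0.023 | 0.023 | 0.023 |
| \(\sigma^{opt}_{\sigma_{\log(N_f)}}\) | 0.018 | 0.018 | 0.018 | 0.018 |
| % Reduction | 22% | 22% | 22% | 22% |
| \(\sum C_{G_i} D_{G_i}\) | $19,240 | $19,240 | $19,240 | $19,240 |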

The reduction was approximately 22% in each of the analyses. The maximum number of additional experiments was allocated to Group \(G_2\) to obtain this reduction. The PDF of \(\sigma_{\log(N_f)}\) is shown in Figure 4. The red area with black dashed line represents the PDF of \(\sigma_{\log(N_f)}\) before adding any data and the blue area represents the PDF after adding the optimal experimental data.
A study on the sensitivity of the fretting fatigue random variables was developed by Golden et al. [5]. They showed that friction coefficient and partial slip slope, group \(G_2\), were dominant, followed by the pad profile, group \(G_4\). They concluded that the traditional variables in fatigue, the crack growth rate, \(G_3\), and initial crack size, \(G_1\), were not significant in terms of the contribution to the output variance. Instead, the friction coefficient and partial slip slope were dominant, as expected in a fretting fatigue analysis. This finding supports in part the results obtained with the optimal allocation methodology, in which every analysis returned the allocation of additional experimental data to group \(G_2\), friction coefficient and partial slip slope.


V.  Conclusions 

Statistical  moments  obtained  from  simulation,  i.e.,  mean  and  standard  deviation  of  the  output,  often  involve 
significant  uncertainty  due  to  the  random  nature  of  the  input  variables.  In  reliability  analysis,  the  quantification  of 
uncertainty  is  of  vital  importance  for  decision-making,  where  the  decisions  may  be  affected  by  the  lack  of 
confidence  in  the  input  variables.  The  optimal  allocation  methodology  proposed  here  reduced  the  variance  of  the 
output  moments.  The  output  moments  were  calculated  using  the  input  population  moments,  which  were  simulated 
using  realizations  of  the  multivariate  t-distribution  and  Wishart  distribution. 

In  the  optimal  allocation  method,  the  variance  of  the  output  moments  may  be  reduced  by  allocating  resources  to 
obtain  more  experimental  data  of  the  input  variables  to  better  characterize  the  moments  of  the  input  probability 
density  function.  The  objective  of  the  optimization  model  was  to  minimize  the  standard  deviation  of  the  output 
moments,  where  the  number  of  additional  experiments  was  constrained  to  the  funds  available.  The  methodology 
combined  a  single-objective  optimization  algorithm  with  a  nested-loop  arrangement.  The  optimization  algorithm 
used  particle  swarm  optimization  (PSO)  modified  to  handle  integer  variables. 

A  fretting  fatigue  problem  was  explored  to  assess  additional  experiments  to  reduce  the  variance  in  the  mean  and 
standard  deviation  of  cycles  to  failure.  The  number  of  additional  experiments  to  add  for  each  random  variable 
necessary  to  reduce  the  standard  deviation  of  the  output  moments  depended  upon  several  factors:  the  number  of 
initial  data  points,  the  influence  of  the  input  variables,  the  cost  of  each  additional  experiment,  and  the  variance  of  the 
sample  mean. 

In the fretting fatigue example, the results found by Golden et al. [5] supported the results of the optimal allocation method. The optimal allocation methodology can be used as a tool to help improve the confidence in the output moments.